>> Good morning, everybody. >> Good morning. >> Nice to start with a good morning. This is all about community, this weekend. I'm Alan, I'm the lead cartographer and also working on a Ph.D. at the University of British Columbia in Vancouver. I have notes on my laptop, but I'm using the presentation computer. So if I ever seem like I'm not moving any slides ahead, please let me know. So this presentation is something I'm working on as part of my dissertation research. Most of my research is looking at the past of OpenStreetMap, looking at the history planet file and the data structures that we can study to see what happened over time. And I'm mostly interested in what that tells us about the community, about the people that make up OSM. What can we tell from that footprint, that fingerprint of how we edit OSM? What does that look like in the database: the things that are edited, the types of things, who the editors are? Are the people who create a feature the same ones who are modifying it later? Or did the people who added all of these roads at the beginning of OSM do a lot of it and move on, with new people coming in later? So these are the questions I'm trying to study. But for this presentation, I'm going to jump off of that and move a little bit into the future of OSM. So, like, take these charts that we see all the time with the ever-increasing node count in OSM. If we really look at the data, these raw statistical numbers, what does that tell us about not only the people behind these charts, but what's going on in the future? What might go on in the future? And one of the reasons that's important is not only because we should be preparing our community, preparing the tools, preparing the structures for what that future might look like, and what we want it to look like. But also, what we imagine this future might be tells us a lot about what types of editing activity, what types of data we value in OSM today.
We all probably have different imaginations of what OSM should become, and so part of what I'm going to try to do is, like, think about what some of those possible futures are, based on what we can see in the past of OSM. So, yeah. Let's -- we have some charts about people, about the number of active users. You will probably remember that some time last year OSM hit two million registered user accounts, which is amazing. But once you dig behind numbers like that, you realize that, well, most of those two million registered for an account, but they never actually edited the database. And then the number of people who are actually editing on an ongoing basis, who have been active each month, is much smaller. So these are numbers from the beginning of this year. So it's a little bit higher; in the Wikimedia keynote earlier we heard it's around 3,500 a month. And, in fact, this is actually really great, I'm excited to get to follow that keynote this morning, because in my research I'm trying to make a lot of parallels with Wikipedia, to see what we can learn from how the Wikipedia community evolved, and also to use some concepts and theories from Wikipedia research and see if we can apply those to OSM. So here's the growth of Wikipedia articles. I think these are -- yeah, the English Wikipedia. So for all the English language articles. And it's increasing. Still going up. But if you look at the curve, the rate of increase is slowly decreasing. The number of articles added each year is less than the number that was added the year before. And here are also those numbers that were talked about this morning. The number of -- the pink line is the number of Wikipedians that make at least five edits per month. That number has been dropping since about 2007. And whether or not these metrics are really the best way to judge what's going on in the community is something we should also debate.
There was talk of, say, a really strong local community where everyone gets together every month and adds just one page. But these are still fairly good metrics to show that there is a decrease in the amount of activity. And those power users, the ones that make 25 edits a month or 100 edits a month, they're not dropping quite as fast. What's happening in Wikipedia is that a smaller number of people are doing more and more of the work. It's like a lot of these long-tailed distributions that we see in Internet communities. Like, there are a few people who are doing the bulk of all the editing, and that's true in OSM as well. The thing with Wikipedia is, nobody quite knows why this is happening. And for almost the last ten years now, the people who are trying to facilitate Wikipedia's growth and studying it, they're all freaking out. Why is this happening? Does this mean there's something fundamentally wrong? Some fundamental problem that is going to cause Wikipedia to die out or crash or something like that over time? So there have been a lot of theories and a lot of attempts at addressing this problem. And one of the possibilities, as we heard mentioned, is that there are still a lot of bad people in the community that are driving away new people, and for those of you who maybe have run into those problems in OSM, that may hit a little bit close to home. We need to make sure that we are being as inclusive as possible and making sure that new users aren't being driven away by their first encounter with a prickly personality that has been around in Wikipedia or OSM for a long time. Wikipedia has a rule or guideline called the notability rule. And Wikipedia basically says an article has to be about a notable topic to be included.
Someone can map every tree in OSM, but in Wikipedia a tree has to have citations, and this is one of the reasons why the rate of new articles is going down, and this is why people who perhaps have an article deleted because it's not notable maybe never come back. So this has been a debate that happened fairly early in the years of Wikipedia. And it got to the point where there were these two factions called inclusionists and deletionists. And some of this is sarcastic, the fact that they created Wikipedia pages for themselves. But everyone agrees that the deletionists won, that you can't just arbitrarily keep adding new pages about whatever you want in Wikipedia. The deletionists won in that notability is pretty strongly enforced. There are still new things to add articles about, but it's not growing that fast, and that might be one of the ways that OSM is different. We do not have a notability rule. Basically an arbitrary amount of detail is possible and to some extent condoned. You can add any tree you want, as many trees as you want. But the problem is somebody is going to have to maintain it. What's that going to look like? That was one of the main arguments of the deletionists in Wikipedia. If you let people have an article about anything, you have so many articles that functionally you can't have people maintaining them to the degree you need them to be maintained. So in my research, I'm trying to think about this idea of maintenance in OSM. What does that look like? What do we call it? The fingerprint of maintenance in OSM data. And I'm borrowing a concept from Wikipedia research called wiki gardeners. They sometimes use a word like wiki gnome. That's someone who fixes grammar, fixes broken links, somebody who gets off on maintaining the data, these articles. And luckily there are enough people in the world who do that, who enjoy that, so Wikipedia maintains really high quality articles that are usually pretty clean, and usually there are not that many broken links.
You need to have enough people who enjoy that type of -- those unglamorous, behind-the-scenes tasks to maintain something like that. So what might that look like in OSM? I'm trying to come up with this idea of map gardening. What would that look like? What is map gardening in OSM? What are those editing tasks that people might do to keep OSM going? The things that happen after that initial trailblazing phase of mapping all the streets in your neighborhood. Most of us have not had a chance to do that, because when we joined OSM, our neighborhood was already on the map. And most of the new people who join OSM going forward will be in that same spot. So how do we make sure that OSM is a healthy community that has gardening? That has people who enjoy maintenance, and where those tasks are valued? Okay. So let me step back a bit, way back to the origin of the universe. [Laughter] And I'm going to try to pull some cosmological metaphors out of the growth of the universe and try to come up with similar ideas and apply them to OSM and see what that would look like. I'm just going to let you look at this for a second while I take a drink, because we're looking at the beginning of the universe. [Laughter] So when I grew up thinking about science and astronomy, we knew the universe was expanding from the Big Bang. Nobody quite knew if it was going to have enough gravity to contract and become the big crunch or if it was going to expand forever. In more recent years, they've actually discovered that the universe is expanding even faster. Something is making it accelerate. So now instead of just the big crunch to worry about, we might have something called the big rip. So if this dark energy out there is forcing things apart faster and faster, the universe might just kind of tear apart, and every molecule and particle will be millions of light years away from each other. How the particles relate to each other, we don't know how it's going to play out, but we can imagine various scenarios.
So in OSM, these are four possible scenarios, and maybe there are more, that seem to me like they might be analogs. And I'll talk about each of the four for the rest of the talk. Basically the ratio of adding features to editing those features, the ratio of growth of the community versus -- these are all different factors that will go into the cosmological future of OSM. And how might we look at a chart that would show these kinds of things? So this is with made-up data, and I'll show you real data in a moment. But time is not on either of these axes. What we have here is the number of nodes that we create along the bottom, going to the right. And then we have the number of edits, modified nodes. So I'm basically looking at the OSM history file, I'm looking at every feature, and every node actually. I'm ignoring ways -- I'm ignoring relations, just to make it simpler. And I'm seeing what version number each node is at, when did it get created and when did it get modified? So every time there's a created node in the database, I'm adding to this number along the bottom. It's cumulative. If the number of nodes started going down, if people started deleting them, I wouldn't capture that here, because I'm just looking at every time somebody adds something, and it moves to the right. Every time somebody modifies a node, our chart goes toward the top. And so then these dots show where the total is at the end of each year. So hypothetically we should see a certain number of new nodes happening and a certain number of edited nodes happening, and this chart will sort of wiggle up to the right as people add more nodes. And so these are also some of those scenarios, again with fake data, what that might look like. The first scenario, which I'm calling the ghost town, is if the dots start slowing down, people add fewer and fewer things each year, and they modify fewer and fewer things each year. So this would happen if maybe our community becomes more and more toxic to new people. People start leaving.
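The chart described above can be sketched in a few lines of Python. This is only a minimal illustration, not the speaker's actual pipeline: the input format (a list of `(node_id, version, year)` tuples already extracted from a history file) and the function name `yearly_trajectory` are assumptions made for the example, and it treats version 1 as a creation and any later version as a modification. Deletions are assumed to have been filtered out beforehand, matching the remark that deletions are not captured.

```python
from collections import Counter


def yearly_trajectory(node_versions):
    """Return one (year, cumulative_created, cumulative_modified) point per year.

    node_versions: iterable of (node_id, version, year) tuples. This input
    format is a hypothetical simplification of an OSM history extract;
    deletion records are assumed to be excluded already.
    """
    created = Counter()   # year -> number of version-1 records (new nodes)
    modified = Counter()  # year -> number of later versions (edits)
    for _node_id, version, year in node_versions:
        if version == 1:
            created[year] += 1
        else:
            modified[year] += 1

    years = set(created) | set(modified)
    if not years:
        return []

    points = []
    total_created = 0
    total_modified = 0
    # Walk every year in the range so quiet years still produce a dot.
    for year in range(min(years), max(years) + 1):
        total_created += created[year]    # moves the dot to the right
        total_modified += modified[year]  # moves the dot upward
        points.append((year, total_created, total_modified))
    return points


if __name__ == "__main__":
    sample = [
        (1, 1, 2006),
        (1, 2, 2007),
        (2, 1, 2007),
        (2, 2, 2009),
        (2, 3, 2009),
    ]
    for point in yearly_trajectory(sample):
        print(point)
```

Plotting the second element of each point on the horizontal axis and the third on the vertical axis gives one dot per year, so neither axis is time, as in the chart described in the talk.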
Maybe our data model and our tools become too complex. Maybe everyone just gets bored. But eventually people just leave the community and we're left with a ghost town. It wouldn't of course look like this. This is just the first result for ghost town in OSM. It would look like the complete map we have today, but imagine everyone just walked away. Ten years down the road, you would have a map that looks complete, except no one would have updated it in ten years, and it would be becoming progressively obsolete. I don't think this is going to happen, but it could. And it's one of the possible scenarios we might be facing. Scenario number two is if we slowly decrease the number of new things that we're adding, but we keep editing them, keep modifying them. This would be if we instituted a notability rule in OSM, maybe. Saying, all right, we want roads, we want buildings, but we're going to say trees are too much, mailboxes are too much, and eventually we'll have added all the roads and all the buildings and all the addresses, and we just want to maintain what we've got. This could be a whole scenario or two. And it could look like a garden. I'm talking metaphorically about gardens, I just happen to be taking a lot of screenshots of gardens in OSM. Some of them are pretty awesome. Here's one with a really nice, well-mapped river bank. This river is going to move every year. So even if we decided not to add anything more in OSM, we would need a community of people who are really into maintaining it, changing all the river banks. If this garden added a new parking lot for the visitors area, we would have to add that, those types of things, but we would not be adding new types of features. And if you have any other good gardens in OSM, please send them to me. I'm collecting them. So what if the reverse happens, though? What if we just keep adding new things? Adding more and more detail, but we're not keeping up with the maintenance? We start to add every tree.
And when we get bored of adding all the trees, we start adding all the blades of grass. There was a great talk about a gardening metaphor at State of the Map in 2011, his slides are online, it's pretty cool, and he talked about, yeah, one of the most insane things you could imagine would be adding every blade of grass in OSM, but I'm sure you could imagine something more insane than that. So why am I calling this the Borges map? Some of you in cartography or geography have heard of this, but for those of you who haven't heard of these types of stories, the short story is actually one paragraph long. That's basically it. That is a made-up quote from some historical book that he imagined existed. It's about this ancient empire where the geographers and cartographers had a map with basically everything on it. It was a one-to-one scale map, and that's completely useless because it has to be as big as the world. So in the story basically they succeeded, and it was useless, and the map just kind of falls apart and is a wasteland, and you see -- you can still find parts of that one-to-one map out in the desert somewhere. So what would that look like in OSM? We're still nowhere near that point of adding detail, I think. Here is a demo made from civic data. This is not from OSM. But it's an example of what if we mapped the edges of every road? That would be really cool for self-driving cars, things like that. Maybe we'll do that. Maybe we're going to have enough time on our hands to do that in the future of OSM. We already have area features for pedestrian ways, like in the city center, where you have a road area that is large enough to be represented as a polygon. We could extend that to basically everything. But you could go further. Again, this is not in OSM, this is an example of some city data from Cambridge, Massachusetts. And they have a GIS file of every paint mark on the road. We can do this in OSM, and it might be really fun.
But it's totally going to get obsolete. If they repaint that road, hopefully somebody is watching OSM and knows to change the paint marks in the OSM database. It would be really hard to keep this up to date. Okay. So finally, what if you keep adding new stuff, but what if you somehow find a way to maintain it? And I'm mixing the metaphors, the terminology, a little bit here. I'm not talking about a cosmological singularity like a black hole. What if computation and information overload just keep expanding to the point where we can't mentally keep up anymore? We don't really know what that would look like. Past the point of prediction. That's the singularity. Maybe it's possible in OSM. Maybe we can have every blade of grass mapped, and we can keep it up to date. Probably not, but that's another kind of possible end point of the scenarios. So what do we actually have in OSM data? This is the real planet file, and after the first few slow years, with almost no difference between 2006 and 2007, we start to see some activity. And then basically from 2010 on, we're kind of maintaining the same ratio of edits. This spike here, where there's a bunch of early maintenance, is apparently from an early version of an editor which was doing live edits in the database. So if you were modifying a feature, you might have just moved it a little bit on the map, and you've created version 10 of that feature, because every movement created a modification. So here's London. It looks like the dots are slowing down a little bit. Is London getting finished? I'm not sure. Here's Berlin. Those last three dots, 2014, 2015, 2016, look like they're speeding up in Berlin, but these are examples of really well mapped cities in OSM. Here's an example of Tokyo. So as a geographer, looking at that one chart of the whole world, of that entire planet file, doesn't tell us that much. So I'm interested in seeing what the different cities, the different parts of the world look like.
And they look very different. So Tokyo is a lot flatter here. They're adding more stuff, they're not doing as much maintenance. Maybe this is more of a Borgesian map. And in places with HOT activations like Haiti, some years there's no activity, and then there's a jump in a year, and then a jump in another year. This is the fingerprint of a community that has bursts of activity but might be sputtering at other times. And San Francisco, where you have a big jump, and then you see the actual amount of activity is quite significant, and it totally exemplifies that TIGER activity. And then we see, like, Moscow. I was really surprised by this one because it's very steady. There's a very clear pattern to activity in Moscow, and it's different from everywhere else. They do a lot more maintenance as a proportion of the new features they're adding. Why are they different? Is there something cultural about the OSM community in question? I don't know yet. And finally, I don't even know if this is at all tenable, but I was, like, okay. What are we looking for? What is an actual value of maintenance that we want to see? We need to do maintenance when people add new features that have errors. So every time there is a new feature in OSM, there's got to be some natural human error or error rate, and we have to maintain those. Also, everything that's in OSM reflects something in the real world, and when the things in the real world change, we have to maintain those. So there's an error rate of adding new things, and there's the change rate of things in the real world. Those two factors should combine to a number of gardening edits. What would that look like? Can we figure out a way to say this OSM community is healthy, this one doesn't have enough maintenance, this one is not? I'm still at the beginning of asking those kinds of questions. And I would invite you to please follow along with these slides at STAMEN.com, hit me up on Twitter, and let me know what you think it tells us about OSM. Thank you. [Applause] I used up most of my time.
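The closing idea of the talk, that an error rate on new features and a change rate in the real world should combine into an expected number of gardening edits, could be sketched as follows. The talk does not say how the two factors combine, so the simple additive form here is purely an assumption, and the function names, parameters, and example rates are all hypothetical.

```python
def expected_gardening_edits(new_nodes, existing_nodes, error_rate, change_rate):
    """Estimate how many maintenance edits a year 'should' need.

    ASSUMPTION (not from the talk): the two factors simply add up.
      error_rate  - fraction of newly added nodes that need a correction
      change_rate - fraction of existing nodes whose real-world counterpart
                    changes during the year
    """
    return new_nodes * error_rate + existing_nodes * change_rate


def maintenance_ratio(observed_modifications, expected_edits):
    """Compare observed edits with the estimate; below 1.0 suggests a backlog."""
    if expected_edits == 0:
        return None  # nothing to maintain, so the ratio is undefined
    return observed_modifications / expected_edits


if __name__ == "__main__":
    # Made-up numbers, in the same spirit as the fake data in the talk.
    expected = expected_gardening_edits(
        new_nodes=1000, existing_nodes=10000, error_rate=0.05, change_rate=0.02
    )
    print(expected)
    print(maintenance_ratio(125, expected))
```

A ratio near or above 1.0 would be one possible reading of a community that is keeping up with maintenance, though real error and change rates would have to be measured per feature type and place before such a number meant anything.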
I can do one or two questions. He's coming around with the microphone. >> I can shout. >> Yeah. Start shouting. >> Do you plan to look at things at either, you know, a regional or a country or even a continental scale versus just the city scale? >> Yeah, could I and would I look at it at different scales than just cities? I would love to. And right now I'm just trying to finish my dissertation without changing my [Laughter] research technique. But basically the new stuff that's coming out in the last year or two, like, Mapbox for analysis, stuff that has really been happening the last few years since I started this. It would be great. I would love to have people take these approaches and apply them to more flexible geographies. It would be amazing. >> I'm from Columbia, how do you think this is affected by mobile applications? For example, we hear about map contributions from phones. >> Yeah. So how do mobile contributions make a difference? I think the way we edit will make a big difference in terms of how much we can keep up with OSM. I think the way we as humans are working with bots that are operating on OSM, the way we're using things like the sign detection from Mapillary, how much we can augment our activity will probably go a long way toward doing these maintenance tasks. How can we create visualizations that will tell us where our problems are, so that we can go and fix them, or prompt us on our mobile phone, like, this doesn't look right based on that left turn you just made? So I think that stuff is happening right now, and it's going to become more important. So when we think about things like the singularity of human intelligence and computer intelligence, that's going to be an important question going forward in OSM. What does it mean to be a human editor versus an augmented editor or bot? What does it mean to look at signals from someone else's mobile phone tracking data?
Are they contributing to OSM, or is it the one looking at their data and adding to the database? I think those are very important questions. >> I just had a question. A lot of the things you talked about were really urban, dense areas. I'm wondering about the opposite, like national parks, city parks, things that have special nodes, trails, parking lots, to see whether maybe the data is more accurate and doesn't require as much editing, as opposed to maybe street names that change or addresses that change or businesses. If you could comment on whether you've thought about that or researched anything. >> You mean things that are popular, like a city park where a lot of people are going? >> Yeah. But they may not have a density of people living there. And I'm thinking of the big U.S. national parks, or important landmarks that should be tagged, important trails, restrooms, parking lots, visitor centers that are pretty standard and stationary and may not change names often, or locations or hours, depending on the specific node, that I can see being more dependable than others that may change more frequently. >> Yeah. And that's definitely the case. The differing quality of OSM across different places, for demographic reasons or just because some places are more visited, is hugely important and something that I'm not really able to control for here. But, yeah, if I can generalize this technique to look at specific types of nodes, like, are park benches maintained more frequently than wastebaskets in a park? What types of things are more prone to being obsolete or need focus? Or which things can we trust more from OSM because we know they would be maintained more often? That would be really important to see -- and to see what these waves of maintenance are. Like, when you add a certain type of feature, do we expect that another type will always come after that, you know? The cause and effect will be really interesting to see, if we can tease out some of these numbers based on the type of edit and the location. All right. Thank you.
[Applause]